Microbial Genomics
● Microbiology Society
All preprints, ranked by how well they match Microbial Genomics's content profile, based on 204 papers previously published here. The average preprint has a 0.11% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.
Stoesser, N.; Phan, H. T.; Seale, A. C.; Aiken, Z.; Thomas, S.; Smith, M.; Wyllie, D.; George, R.; Sebra, R.; Mathers, A. J.; Vaughan, A.; Peto, T. E.; Ellington, M. J.; Hopkins, K. L.; Crook, D. W.; Orlek, A.; Welfare, W.; Cawthorne, J.; Lenney, C.; Dodgson, A.; Woodford, N.; Walker, A. S.; TRACE Investigators' Group,
Carbapenem resistance in Enterobacterales is a public health threat. Klebsiella pneumoniae carbapenemase (encoded by alleles of the blaKPC family) is one of the commonest transmissible carbapenem resistance mechanisms worldwide. The dissemination of blaKPC has historically been associated with distinct K. pneumoniae lineages (clonal group 258 [CG258]), a particular plasmid family (pKpQIL), and a composite transposon (Tn4401). In the UK, blaKPC has caused a large-scale, persistent outbreak focused on hospitals in North-West England. This outbreak has evolved to be polyclonal and poly-species, but the genetic mechanisms underpinning this evolution have not been elucidated in detail; this study used short-read whole genome sequencing of 604 blaKPC-positive isolates (Illumina) and long-read assembly (PacBio)/polishing (Illumina) of 21 isolates for characterisation. We observed the dissemination of blaKPC (predominantly blaKPC-2; 573/604 [95%] isolates) across eight species and more than 100 known sequence types. Although there was some variation at the transposon level (mostly Tn4401a, 584/604 (97%) isolates; predominantly with ATTGA-ATTGA target site duplications, 465/604 [77%] isolates), blaKPC spread appears to have been supported by highly fluid, modular exchange of larger genetic segments amongst plasmid populations dominated by IncFIB (580/604 isolates), IncFII (545/604 isolates) and IncR replicons (252/604 isolates). The subset of reconstructed plasmid sequences also highlighted modular exchange amongst non-blaKPC and blaKPC plasmids, and the common presence of multiple replicons within blaKPC plasmid structures (>60%). 
The substantial genomic plasticity observed has important implications for our understanding of the epidemiology of transmissible carbapenem resistance in Enterobacterales, for the implementation of adequate surveillance approaches, and for control. IMPORTANCE: Antimicrobial resistance is a major threat to the management of infections, and resistance to carbapenems, one of the "last line" antibiotics available for managing drug-resistant infections, is a significant problem. This study used large-scale whole genome sequencing over a five-year period in the UK to highlight the complexity of genetic structures facilitating the spread of an important carbapenem resistance gene (blaKPC) amongst a number of bacterial species that cause disease in humans. In contrast to a recent pan-European study from 2012-2013 (1), which demonstrated the major role of spread of clonal blaKPC-Klebsiella pneumoniae lineages in continental Europe, our study highlights the substantial plasticity in genetic mechanisms underpinning the dissemination of blaKPC. This genetic flux has important implications for: the surveillance of drug resistance (i.e. making surveillance more difficult); detection of outbreaks and tracking hospital transmission; generalizability of surveillance findings over time and for different regions; and for the implementation and evaluation of control interventions.
Slow, S.; Anderson, T.; Murdoch, D.; Bloomfield, S.; Winter, D.; Biggs, P.
Legionella longbeachae is an environmental bacterium that is commonly found in soil and composted plant material. In New Zealand (NZ) it is the most clinically significant Legionella species, causing around two-thirds of all notified cases of Legionnaires' disease. Here we report the sequencing and analysis of the geo-temporal genetic diversity of 54 L. longbeachae serogroup 1 (sg1) clinical isolates that were derived from cases from around NZ over a 22-year period, including one complete genome and its associated methylome. Our complete genome consisted of a 4.1 Mb chromosome and a 108 kb plasmid. The genome was highly methylated with two known epigenetic modifications, m4C and m6A, occurring in particular sequence motifs within the genome. Phylogenetic analysis demonstrated the 54 sg1 isolates belonged to two main clades that last shared a common ancestor between 108 BCE and 1608 CE. These isolates also showed diversity at the genome-structural level, with large-scale rearrangements occurring in some regions of the chromosome and evidence of extensive chromosomal and plasmid recombination. This includes the presence of plasmids derived from recombination and horizontal gene transfer between various Legionella species, indicating there has been both intra-species and inter-species gene flow. However, because similar plasmids were found among isolates within each clade, plasmid recombination events may pre-empt the emergence of new L. longbeachae strains. Our high-quality reference genome and extensive genetic diversity data will serve as a platform for future work linking genetic, epigenetic and functional diversity in this globally important emerging environmental pathogen. Author Summary: Legionnaires' disease is a serious, sometimes fatal pneumonia caused by bacteria of the genus Legionella. In New Zealand, the species that causes the majority of disease is Legionella longbeachae. 
Although the analysis of pathogenic bacterial genomes is an important tool for unravelling evolutionary relationships and identifying genes and pathways that are associated with their disease-causing ability, until recently genomic data for L. longbeachae has been sparse. Here, we conducted a large-scale genomic analysis of 54 L. longbeachae isolates that had been obtained from people hospitalised with Legionnaires' disease between 1993 and 2015 from 8 regions around New Zealand. Based on our genome analysis, the isolates could be divided into two main groups that persisted over time and last shared a common ancestor up to 1700 years ago. Analysis of the bacterial chromosome revealed areas of high modification through the addition of methyl groups, and these were associated with particular DNA sequence motifs. We also found there have been large-scale rearrangements in some regions of the chromosome, producing variability between the different L. longbeachae strains, as well as evidence of gene flow between the various Legionella species via the exchange of plasmid DNA.
Gibbon, M. J.; Couto, N.; Cozens, K.; Habib, S.; Cowley, L.; Aanensen, D.; Corander, J.; Thorpe, H.; Hetland, M. A.; Sassera, D.; Merla, C.; Corbella, M.; Ferrari, C.; Turner, K. M.; Sirikancha, K.; Dulyayangkul, P.; Alhusein, N.; Charoenlap, N.; Thamlikitikul, V.; Avison, M. B.; Feil, E. J.
Background: Klebsiella pneumoniae (Kp) is an important pathogen of humans and animals, and recent reports of convergent strains that carry both virulence and antimicrobial resistance genes (ARGs) have raised serious public health concern. The plasmid-borne iuc locus, encoding the siderophore aerobactin, is a key virulence factor in this species. The variant iuc3 is associated with porcine and human clinical isolates and is carried by mostly uncharacterised IncF plasmids. Methods: We used a combination of short-read and long-read sequencing to characterise IncFIB(K)/IncFII iuc3-carrying plasmids harboured by 79 Kp isolates and one K. oxytoca isolate recovered as part of two large One-Health studies in Italy (SpARK) and Thailand (OH-DART). Adding data from public repositories gave a combined dataset of 517 iuc3 isolates, and the plasmids were analysed using both clustering and phylogenetic methods. Findings: We note seven large, convergent plasmids from Thailand that have emerged through the hybridisation of co-circulating plasmids harbouring iuc3 and ARGs encoding extended-spectrum beta-lactamases (ESBLs). We were also able to identify putative parental plasmids, which were mostly associated with two neighbouring meat markets, as were the hybrid plasmids. Clustering and global phylogenetic analyses resolved an iuc3 plasmid sub-group circulating throughout Asia, with occasional examples in Europe and elsewhere. This variant carries multiple ARGs and is commonly harboured by clinical isolates, thus warranting targeted plasmid surveillance. Interpretation: Our study reveals that plasmid hybridisation leading to the convergence of resistance and virulence traits may be very common, even in non-clinical (One-Health) settings. 
Population-scale plasmid genomics makes it possible to identify putative parental plasmids, which will help to identify plasmid types that are most likely to hybridise, and what the selective consequences may be for the plasmid and host. A distinct iuc3 plasmid sub-variant is associated with clinical isolates in Asia, which requires close monitoring. Research In Context: Multiple reports of convergent clones of Klebsiella pneumoniae that combine both hypervirulence and multidrug resistance (MDR-hvKp) have been published recently; a PubMed search in November 2023 using the key words "convergence Klebsiella pneumoniae" returned 143 papers, 99 of which were published from 2020 onwards. Our study demonstrates that the hybridisation of plasmids carrying AMR and virulence genes is a frequent, ongoing process in natural populations. The subsequent transfer of plasmids conferring both traits is thus likely to be a key driver behind the spread of convergent strains. Our study also provides an exemplar of how hybrid assemblies can facilitate large-scale global genomic plasmid epidemiology. Evidence before the study: Although multiple recent reports highlight the emergence and spread of convergent Kp strains, the confluence of resistance and virulence genes within the same plasmid has not been studied at a population level, and putative parental plasmids are rarely identified. Moreover, there have been few high-resolution genomic epidemiology studies on closely related plasmids using both long and short-read data on a global scale. Added value: We more than double the number of complete sequences available for plasmids harbouring iuc3 from 58 to 139 and provide evidence on the host lineages most likely to harbour these plasmids (e.g., ST35), and epidemiological source (e.g., pig, wild animal, human). Our comparative analysis of phylogenetic and clustering approaches will help to inform future plasmid epidemiological studies. 
Implications: The hybridisation of plasmids harbouring virulence and resistance genes occurs frequently in natural populations, even within One-Health settings. However, the selective drivers (if any) and evolutionary consequences of this phenomenon are unclear. There is clear utility in generating closed plasmid genomes on a population scale, and targeted plasmid surveillance on a clinical sub-variant of iuc3 plasmids is warranted.
Connor, C. H.; Wick, R. R.; Gorrie, C. L.; Ingle, D. J.; Lam, M. M.
The critical role of plasmids, particularly in the dissemination of AMR and virulence in nosocomial pathogens like Klebsiella pneumoniae, underpins the need for robust and scalable tools for plasmid identification and reconstruction from large-scale datasets of short-read sequence data that are routinely generated in clinical and public health settings. Here, we sought to evaluate all available tools to determine which are best suited for reconstructing the plasmidome of K. pneumoniae and related species from the species complex (KpSC). We used a publicly available dataset of 568 diverse KpSC isolates that had high-quality short-read Illumina data and matched complete genomes generated from hybrid assemblies incorporating additional long-read sequence data. This allowed us to investigate which tools perform best at recovering plasmid sequences when only short-read data are available. None of the tools tested offered a comprehensive or consistently reliable solution to assembling plasmids from short-read sequence data of KpSC. Among the six tools that were benchmarked, performance varied across total runtime, RAM usage, prediction accuracy and sensitivity. Future tools developed in this space should offer meaningful advancements over existing tools and be rigorously evaluated using large, standardised bacterial datasets that reflect the diversity and complexity of plasmid content to ensure comparability across benchmarking studies.
Cheney, L.; Payne, M.; Kaur, S.; Mckew, G.; Lan, R.
Staphylococcus aureus is a major source of both hospital and community acquired infections, and is the leading source of skin and soft tissue infections worldwide. Advances in whole genome sequencing (WGS) technologies have recently generated large volumes of S. aureus WGS data. The timely classification of S. aureus WGS data with genomic typing technologies has the potential to describe detailed genomic epidemiology at large and small scales. In this study, a multilevel genome typing (MGT) scheme comprised of 8 levels of multilocus sequence typing schemes of increasing resolution was developed for S. aureus and used to analyse 50,481 publicly available genomes. Application of MGT to S. aureus epidemiology was showcased in three case studies. Firstly, the population structure of the globally disseminated sequence type ST8 was described by MGT2, which was compared with Spa typing. Secondly, MGT was used to characterise MLST ST8 - PFGE USA300 isolates that colonised multiple body sites of the same patient. Unique STs from multiple MGT levels were able to group isolates of the same patient, and the highest resolution MGT8 separated isolates within a patient that varied in predicted antimicrobial resistance. Lastly, MGT was used to describe the transmission of MLST ST239 - SCCmec III throughout a single hospital. MGT STs were able to describe both isolates that had spread between wards and also isolates that had colonised different reservoirs within a ward. The S. aureus MGT describes large- and small-scale S. aureus genomic epidemiology with scalable resolutions using stable and standardised ST assignments. The S. aureus MGT database is online (https://mgtdb.unsw.edu.au/staphylococcus) and is capable of tracking new and existing clones to facilitate the design of new strategies to reduce the global burden of S. aureus related diseases.
Tumeo, A.; Kovarova, A.; McDonagh, F.; Ryan, K.; Clarke, C.; Miliotis, G.
Phytobacter is a recently delineated, frequently misidentified genus within the order Enterobacterales. Following two rare cases of patient colonization with multidrug resistant Phytobacter in Ireland, this study presents a genus-wide genomic analysis that aims to define the pathogenic potential of Phytobacter species, with emphasis on their role as emerging human pathogens and reservoirs of carbapenemases. Two carbapenemase-encoding isolates were recovered from rectal swabs in Ireland in 2024 and were initially identified as Phytobacter by MALDI-ToF. Whole-genome sequencing with in silico species typing (dDDH, ANI) provided definitive taxonomic resolution. A genus-wide maximum-likelihood core-genome phylogeny was reconstructed, and the plasmidome and resistome were bioinformatically profiled across all available Phytobacter genomes. Phenotypic susceptibility of the Irish isolates was determined through minimum inhibitory concentration (MIC) testing. The Irish isolates (P. diazotrophicus E787336 and P. ursingii E980862) are the first reported Phytobacter strains carrying both plasmid-borne blaIMP-4 and mcr-9.1 in the genus. MIC testing confirmed resistance to aztreonam, aminoglycosides, cephalosporins, fluoroquinolones, and the β-lactam/β-lactamase inhibitor combination piperacillin-tazobactam. Across 34 Phytobacter genomes examined, 22 distinct plasmid replicon types were identified in 22 isolates, often shared across species. The genus-wide resistome encompassed 71 genes, more than half predicted to be acquired, with carbapenemases detected in 26.5% (9/34) of the genomes. In summary, Phytobacter harbors a diverse, plasmid-borne resistome including carbapenemases, with documented cases of human colonization and infection. These findings support its recognition as an emerging pathogen and reservoir of antimicrobial resistance, underscoring the need for improved clinical identification, genomic surveillance, and preparedness for limited treatment options. 
Author summary: Since Phytobacter was first characterized in 2007, this bacterial genus has mainly been associated with plant growth promotion. More recently, however, increasing reports of human infections have raised concerns about its potential as an emerging bacterial pathogen. These concerns are further underscored by the description of multidrug resistant isolates capable of withstanding different classes of antimicrobials, including carbapenems, which are often used as a last line of treatment. Following the rare finding of multidrug resistant Phytobacter in two patients in Ireland, we combined bioinformatics and laboratory testing to characterize the antimicrobial resistance profile of this overlooked bacterial genus. Our results uncover the variety of resistance determinants, including those against carbapenems, that are encoded in the genomes of Phytobacter. This shows its potential as a hidden reservoir of drug resistance and an emerging bacterial pathogen. We encourage improved clinical recognition and monitoring of Phytobacter to better anticipate infections with limited therapeutic options.
Bennett, R. J.; De Silva, P. M.; Bengtsson, R. J.; Horsburgh, M. J.; Blower, T. R.; Baker, K. S.
Bacteria of the genus Shigella are a major contributor to the global diarrhoeal disease burden, causing >200,000 deaths per annum, with S. flexneri the major pathogenic species. Increasing antimicrobial resistance (AMR) in Shigella and the lack of a licenced vaccine have led WHO to recognise Shigella as a priority organism for the development of new antimicrobials. Understanding what drives the long-term persistence and success of this pathogen is critical for ongoing shigellosis management and is relevant for other enteric bacteria. To identify key genetic drivers of Shigella evolution over the past 100 years, we analysed S. flexneri from the historical Murray collection (n=45, isolated between 1917-1954) alongside a comparatively modern collection (n=262, isolated between 1950-2011) using a novel approach called temporal genome-wide association study (tGWAS). We identified SNPs (n=94), COGs (n=359) and significant kmers within 48 genes significantly positively associated with time. These included T3SS encoding genes, proteins involved in intracellular competition, acquired antimicrobial resistance genes, insertion sequences, and genes of unknown function (28%, 49/172 of those hits investigated). Among the unknown proteins we identified a novel plasmid borne putative adhesin, named Stv. Genomic epidemiological analyses reveal that Stv was associated with clonal expansions of multiple phylogroups of S. flexneri and its acquisition predates multidrug resistance acquisition and the global dissemination of Lineage III S. sonnei. Stv, and close relatives, are widely distributed in other Enterobacteriaceae and bacteria, indicating that its importance likely extends beyond shigellae. This work highlights the effectiveness of using tGWAS on historical isolate collections for identifying novel contributors to pathogen success over time. 
This approach is readily translatable to other pathogens and our application in Shigella identified Stv, a putative adhesin and potential drug target that is widely distributed across the AMR priority group Enterobacteriaceae. Author Summary: Shigellosis is a leading cause of diarrhoeal disease worldwide and is represented among the multiple Enterobacteriaceae which WHO have declared as priority pathogens for which new antimicrobials are urgently needed. The majority of shigellosis is caused by the species Shigella flexneri and Shigella sonnei. In this study we collated S. flexneri isolates that spanned a 94-year period, encompassing the pre- and post-antibiotic era, to implement a novel bioinformatic technique, temporal GWAS (tGWAS), to identify key factors of pathogen success during this time period. Alongside recovering AMR and virulence genes, we also identified a novel, mobilisable putative adhesin, named Stv herein, which appeared to contribute to clonal expansions across multiple Shigella species and is present across the broader Enterobacteriaceae. Our results indicate the potential importance of Stv in controlling Shigella and other infections, and the validity of a tGWAS approach for identifying biological drivers underpinning the evolution and expansion of AMR pathogens over time.
Ovsepian, A.; Delgado-Blas, J.; Rethoret-Pasty, M.; Martin, M. J.; Lebreton, F.; Brisse, S.
Background: Core genome multilocus sequence typing (cgMLST) is a powerful method for bacterial strain genotyping. However, the size of the core genome decreases as the phylogenetic breadth of the target group increases, reducing discriminatory power. To overcome this discrimination/applicability tradeoff, here we developed a cumulative cgMLST approach, where sets of core loci conserved within nested phylogenetic entities are added. We illustrate this approach using the Klebsiella pneumoniae species complex (KpSC), for which a widely used cgMLST scheme (KpSC-cgMLST) comprises only 629 genes. Methods: We created non-redundant cgMLST schemes for the individual species K. pneumoniae sensu stricto (Kpn-cgMLST scheme), and its multidrug resistant sublineages (SLs) SL147 and SL307. To extract core genes, we used 37,874 genome assemblies originating from over 80 countries worldwide. A methodology was set to filter redundant loci before importing them into the genotyping tool BIGSdb, where they were combined into schemes together with preexisting loci conserved at higher phylogenetic levels. The performance of the cumulative cgMLST schemes was evaluated on previously published datasets and on novel data from an inter-hospital outbreak of SL307. Results: The Kpn-cgMLST, SL147 and SL307 schemes comprise 2752, 852, and 947 additional loci, respectively. The mean allele call rate of the novel loci was >99% in the validation datasets. Compared to the KpSC scheme used alone, pairwise allelic distances among isolates increased on average 5.6-fold using the Kpn scheme, and further by 20% and 30% using the SL147 and SL307 schemes, respectively. We demonstrate the added value of this increased discriminatory power for epidemiological analyses and show nearly equal discrimination when compared to whole-genome single nucleotide polymorphisms analysis. 
Conclusions: The cumulative cgMLST strategy combines broad phylogenetic applicability and nearly complete genotyping resolution, expanding the utility of this harmonized approach for genomic epidemiology.
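The idea of a pairwise allelic distance growing as nested locus sets are added can be illustrated with a small sketch. This is not the authors' pipeline or BIGSdb code: the locus names, allele numbers and the two toy schemes below are hypothetical, and the distance is assumed to be the simple count of loci at which both isolates have an allele call and the calls differ.

```python
from typing import Dict, List, Optional

# An allele profile maps a locus name to an allele number (None = missing call).
Profile = Dict[str, Optional[int]]


def allelic_distance(a: Profile, b: Profile, loci: List[str]) -> int:
    """Count loci where both isolates have a call and the alleles differ."""
    return sum(
        1
        for locus in loci
        if a.get(locus) is not None
        and b.get(locus) is not None
        and a[locus] != b[locus]
    )


# Hypothetical locus sets: a broad scheme plus loci conserved in a nested group.
broad_scheme = ["locA", "locB"]
nested_scheme = ["locC", "locD", "locE"]

# Cumulative scheme: broad loci plus nested loci, skipping any duplicates.
cumulative_scheme = broad_scheme + [l for l in nested_scheme if l not in broad_scheme]

# Two hypothetical isolates; locE has a missing call in the first one.
iso1: Profile = {"locA": 1, "locB": 2, "locC": 5, "locD": 7, "locE": None}
iso2: Profile = {"locA": 1, "locB": 3, "locC": 6, "locD": 7, "locE": 4}

d_broad = allelic_distance(iso1, iso2, broad_scheme)
d_cumulative = allelic_distance(iso1, iso2, cumulative_scheme)
print(d_broad, d_cumulative)
```

In this toy case the two isolates differ at one locus under the broad scheme and at two loci under the cumulative scheme, which is the sense in which adding nested core loci can increase discrimination. Handling of missing calls varies between tools, so ignoring them here is only one possible choice.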
Matlock, W.; Lipworth, S.; Chau, K. K.; AbuOun, M.; Barker, L.; Kavanagh, J.; Andersson, M.; Oakley, S.; Morgan, M.; Crook, D. W.; Read, D. S.; Anjum, M.; Shaw, L. P.; Stoesser, N.; REHAB Consortium,
Plasmids enable the dissemination of antimicrobial resistance (AMR) in common Enterobacterales pathogens, representing a major public health challenge. However, the extent of plasmid sharing and evolution between Enterobacterales causing human infections and other niches remains unclear, including the emergence of resistance plasmids. Dense, unselected sampling is highly relevant to developing our understanding of plasmid epidemiology and designing appropriate interventions to limit the emergence and dissemination of plasmid-associated AMR. We established a geographically and temporally restricted collection of human bloodstream infection (BSI)-associated, livestock-associated (cattle, pig, poultry, and sheep faeces, farm soils) and wastewater treatment work (WwTW)-associated (influent, effluent, waterways upstream/downstream of effluent outlets) Enterobacterales. Isolates were collected between 2008-2020 from sites <60km apart in Oxfordshire, UK. Pangenome analysis of plasmid clusters revealed shared "backbones", with phylogenies suggesting an intertwined ecology where well-conserved plasmid backbones carry diverse accessory functions, including AMR genes. Many plasmid "backbones" were seen across species and niches, raising the possibility that plasmid movement between these followed by rapid accessory gene change could be relatively common. Overall, the signature of identical plasmid sharing is likely to be a highly transient one, implying that plasmid movement might be occurring at greater rates than previously estimated, raising a challenge for future genomic One Health studies. Funding: This study was funded by the Antimicrobial Resistance Cross-council Initiative supported by the seven research councils and the NIHR, UK.
Lam, M. M.; Salisbury, S. M.; Treat, L. P.; Wick, R. R.; Judd, L. M.; Wyres, K. L.; Brisse, S.; Walker, K. A.; Miller, V. L.; Holt, K. E.
Background: Klebsiella pneumoniae is an opportunistic pathogen and a leading cause of healthcare-associated infections in hospitals, which are frequently antimicrobial resistant (AMR). Exacerbating the public health threat posed by K. pneumoniae, some strains also harbor additional hypervirulence determinants typically acquired via mobile genetic elements such as the well-characterised large virulence plasmid KpVP-1. The rmpADC locus is considered a key virulence feature of K. pneumoniae and is associated with upregulated capsule expression and the hypermucoid phenotype, which can enhance virulence by contributing to serum resistance. Typically such strains have been susceptible to all antimicrobials besides ampicillin; however, the recent emergence of AMR hypermucoid strains is concerning. Methods: Here, we investigate the genetic diversity, evolution, mobilisation and prevalence of rmpADC in a dataset of 14000 genomes from isolates of the Klebsiella pneumoniae species complex, and describe the RmST virulence typing scheme for tracking rmpADC variants for the purposes of genomic surveillance. Additionally, we examine the functionality of representatives for variants of rmpADC introduced into a mutant strain lacking its native rmpADC locus. Results: The rmpADC locus was detected in 7% of the dataset, mostly from genomes of K. pneumoniae and a very small number of K. variicola and K. quasipneumoniae. Sequence variants of rmpADC grouped into five distinct lineages (rmp1, rmp2, rmp2A, rmp3 and rmp4) that corresponded to unique mobile elements, and were differentially distributed across different populations (i.e. clonal groups) of K. pneumoniae. All variants were demonstrated to produce enhanced capsule production and hypermucoviscosity. Conclusion: These results provide an overview of the diversity and evolution of a prominent K. pneumoniae virulence factor and support the idea that screening for rmpADC in K. pneumoniae isolates and genomes is valuable to monitor the emergence and spread of hypermucoid K. pneumoniae, including AMR strains.
Pearse, O.; Zuza, A.; Lester, R.; Mangochi, H.; Siyabu, P.; Tewesa, E.; Edwards, T.; Thomson, N.; Feasey, N.; Kawaza, K.; Jewell, C.; Musicha, P.; Cornick, J.; Heinz, E.
Klebsiella pneumoniae is a frequent cause of antimicrobial resistant healthcare associated infections in neonates across sub-Saharan Africa, with multiple lineages associated with neonatal sepsis. However, the full diversity of circulating strains and key reservoirs facilitating transmission within hospitals is unknown. We investigated the population structure and within-sample diversity of K. pneumoniae in a Malawian neonatal unit. We recruited 94 mother-neonate pairs and collected regular stool samples, hand swabs, cot swabs and swaddling cloth samples. Additionally, we collected ward surface-swab samples and staff hand swabs weekly. To establish within-sample diversity we employed a dual sequencing approach: (i) single colony picks from Extended-Spectrum Beta-Lactamase selective chromogenic agar for short-read whole genome sequencing; and (ii) post-enrichment metagenomics using plate sweeps from a non-selective agar. In total, we analysed 552 single-colony picks and 772 plate-sweeps from neonate, maternal and environmental samples. Comparing sequence types, surface antigens, antimicrobial resistance and virulence genes, and plasmid replicons, between sample types and sequencing approaches, we identified key advantages and limitations of post-enrichment metagenomics. Our approach revealed high diversity at both the ward and individual level, with a high proportion of the overall diversity likely due to Extended-Spectrum Beta-Lactamase negative organisms. ST15 and ST307 were found in high numbers using both methodologies, whilst ST14 was identified primarily from the non-selective post-enrichment metagenomic samples. Isolates and samples from ward surface swabs had more antimicrobial resistance genes and plasmid replicons than those isolated from human stool. This approach demonstrates the value of combining colony-based and metagenomic sequencing approaches, as a cost-effective alternative to shotgun metagenomics to study health care associated infections. 
Impact statement: Klebsiella pneumoniae is an important cause of healthcare-associated infections (HAIs) in neonates, particularly in sub-Saharan Africa. While multiple strains can colonise a single person or surface, the full diversity of K. pneumoniae in the hospital environment is unknown. This study investigates ward-level and within-sample diversity of K. pneumoniae in a Malawian neonatal unit. We collected stool samples from babies and their mothers, as well as swabs from ward surfaces. To characterize bacterial diversity, we compared two approaches: (i) single-colony whole-genome sequencing of isolates selected under antimicrobial pressure and (ii) post-enrichment metagenomics, where entire microbial communities were sequenced from non-selective culture plates in a single run. We found high diversity of K. pneumoniae, with considerable diversity revealed by post-enrichment metagenomics. Samples frequently had multiple strains of K. pneumoniae within them and this varied by sample type. These data highlight the need to utilize methods that account for within-sample diversity when investigating K. pneumoniae, and indicate that single colony WGS is inadequate and that a combination of post-enrichment metagenomics and single colony WGS is preferable.
Zhi, X.; Vieira, A.; Huse, K. K.; Martel, P. J.; Lobkowicz, L.; Li, H. K.; Croucher, N.; Andrew, I.; Game, L.; Sriskandan, S.
Background & Aims: The standalone regulator RofA is a positive regulator of the pilus locus in Streptococcus pyogenes. Found in only certain emm genotypes, RofA has been reported to regulate other virulence factors, although its role in the globally dominant emm1 S. pyogenes is unclear. Given the recent emergence of a new emm1 (M1UK) toxigenic lineage that is distinguished by three non-synonymous SNPs in rofA, we characterized the rofA regulon in six emm1 strains that are representative of the two contemporary major emm1 lineages (M1global and M1UK) using RNAseq analysis, and then determined the specific role of the M1UK-specific rofA SNPs. Results: Deletion of rofA in three M1global strains led to altered expression of 14 genes, including six non-pilus locus genes. In M1UK strains, deletion of rofA led to altered expression of 16 genes, including 9 genes that were unique to M1UK. Only the pilus locus genes were common to the RofA regulons of both lineages, while transcriptomic changes varied between strains even within the same lineage. Although introduction of the 3 SNPs into rofA did not impact gene expression in an M1global strain, reversal of 3 SNPs in an M1UK strain led to an unexpected number of transcriptomic changes that in part recapitulated transcriptomic changes seen when deleting RofA in the same strain. Computational analysis predicted interactions with a key histidine residue in the PRD domain of RofA would differ between M1UK and M1global. Summary: RofA is a positive regulator of the pilus locus in all emm1 strains but effects on other genes are strain- and lineage-specific, with no clear, common DNA binding motif. The SNPs in rofA that characterize M1UK may impact regulation of RofA; whether they alter phosphorylation of the RofA PRD domain requires further investigation. 
Author summary: RofA belongs to the group of "mga-like" bacterial regulatory proteins that comprise a DNA binding domain as well as a phosphorylation domain (PRD) that is responsive to changes in sugar availability. In certain emm genotypes of Streptococcus pyogenes, rofA sits upstream of the pilus locus, where it acts as a positive regulator. The recent emergence of a SpeA exotoxin-producing sublineage of emm1 S. pyogenes (M1UK) has focused attention on the role of RofA; M1UK and its associated sublineages are characterized by 3 non-synonymous SNPs in rofA, which include adjacent SNPs in the PRD domain. Here, we determine the impact of rofA deletion and the 3 rofA SNPs in both the widely disseminated M1global clone and the newly emergent M1UK clone. While production of SpeA undoubtedly contributes to infection pathogenesis, the evolution of M1UK points to a role for metabolic regulatory rewiring in the success of this lineage.
Tsang, K. K.; Lam, M. M. C.; Wick, R. R.; Wyres, K. L.; Bachman, M. A.; Baker, S.; Barry, K.; Brisse, S.; Campino, S.; Chiaverini, A.; Cirillo, D. M.; Clark, T. G.; Corander, J.; Corbella, M.; Cornacchia, A.; Cuenod, A.; D'Alterio, N.; Di Marco, F.; Donado-Godoy, P.; Egli, A.; Farzana, R.; Feil, E. J.; Fostervold, A.; Gorrie, C. L.; Gütlin, Y.; Hassan, B.; Hetland, M. A. K.; Hoa, L. N. M.; Hoi, L. T.; Howden, B.; Ikhimiukor, O. O.; Jenney, A. W.; Kaspersen, H.; Khokhar, F.; Leangapichart, T.; Ligowska-Marzeta, M.; Löhr, I. H.; Long, S. W.; Mathers, A. J.; McArthur, A. G.; Nagaraj, G.; Oaikhe
Interpreting phenotypes of blaSHV alleles in Klebsiella pneumoniae genomes is complex. While all strains are expected to carry a chromosomal copy conferring resistance to ampicillin, they may also carry mutations in chromosomal blaSHV alleles or additional plasmid-borne blaSHV alleles that have extended-spectrum β-lactamase (ESBL) activity and/or β-lactamase inhibitor (BLI) resistance activity. In addition, the role of individual mutations/amino acid changes is not completely documented or understood. This has led to confusion in the literature and in antimicrobial resistance (AMR) gene databases (e.g., NCBI's Reference Gene Catalog and the β-lactamase database (BLDB)) over the specific functionality of individual SHV protein variants. Therefore, identification of ESBL-producing strains from K. pneumoniae genome data is complicated. Here, we reviewed the experimental evidence for the expansion of SHV enzyme function associated with specific amino-acid substitutions. We then systematically assigned SHV alleles to functional classes (wildtype, ESBL, BLI-resistant) based on the presence of these mutations. This resulted in the re-classification of 37 SHV alleles compared with current assignments in NCBI's Reference Gene Catalog and/or BLDB (21 to wildtype, 12 to ESBL, 4 to BLI-resistant). Phylogenetic and comparative genomic analyses support that: i) SHV-1 (encoded by blaSHV-1) is the ancestral chromosomal variant; ii) ESBL and BLI-resistant variants have evolved multiple times through parallel substitution mutations; iii) ESBL variants are mostly mobilised to plasmids; iv) BLI-resistant variants mostly result from mutations in chromosomal blaSHV. We used matched genome-phenotype data from the KlebNET-GSP Genotype-Phenotype Group to identify 3,999 K. pneumoniae isolates carrying one or more blaSHV alleles but no other acquired β-lactamases, with which we assessed genotype-phenotype relationships for blaSHV.
This collection includes human, animal, and environmental isolates collected between 2001 and 2021 from 24 countries across six continents. Our analysis supports that mutations at Ambler sites 238 and 179 confer ESBL activity, while most omega-loop substitutions do not. Our data also provide direct support for wildtype assignment of 67 protein variants, including eight that were noted in public databases as ESBL. We reclassified these eight variants as wildtype, because they lack ESBL-associated mutations, and our phenotype data support susceptibility to 3GCs (SHV-27, SHV-38, SHV-40, SHV-41, SHV-42, SHV-65, SHV-164, SHV-187). The approach and results outlined here have been implemented in Kleborate v2.4.1 (a software tool for genotyping K. pneumoniae from genome assemblies), whereby known and novel blaSHV alleles are classified based on causative mutations. Kleborate v2.4.1 was also updated to include ten novel protein variants from the KlebNET-GSP dataset and all alleles in public databases as of November 2023. This study demonstrates the power of sharing AMR phenotypes alongside genome data to improve understanding of resistance mechanisms. Impact statement: Since every K. pneumoniae genome has an intrinsic SHV β-lactamase and may also carry additional mobile forms, the correct interpretation of blaSHV genes detected in genome data can be challenging and can lead to K. pneumoniae being misclassified as ESBL-producing. Here, we use matched K. pneumoniae genome and drug susceptibility data contributed from dozens of studies, together with systematic literature review of experimental evidence, to improve our understanding of blaSHV allele variation and mapping of genotype to phenotype. This study shows the value of coordinated data sharing, in this case via the KlebNET-GSP Genotype-Phenotype Group, to improve our understanding of the evolutionary history and functionality of blaSHV genes.
The results are captured in an open-source AMR dictionary utilised by the Kleborate genotyping tool, which could easily be incorporated into or used to update other tools and AMR gene databases. This work is part of the wider efforts of the KlebNET-GSP group to develop and support a unified platform tailored for the analysis and interpretation of K. pneumoniae genomes by a wide range of stakeholders. Data summary: blaSHV allele sequences and class assignments are distributed with Kleborate v2.4.1, DOI:10.5281/zenodo.10469001. Table S1 provides a summary of blaSHV alleles, including primary accessions, class-modifying mutations, and supporting evidence for class assignments that differ from NCBI's Reference Gene Catalog or BLDB. Whole genome sequence data are publicly available as reads and/or assemblies; individual accessions are given in Table S2, and corresponding genotypes and antibiotic susceptibility phenotypes and measurements are available in Tables S3 and S4, respectively.
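The mutation-based class assignment described in this abstract can be illustrated with a minimal sketch. It is not Kleborate's code or its AMR dictionary: the function name `classify_shv` and the `bli_sites` parameter are hypothetical, the only ESBL sites used are the two Ambler positions named in the abstract (238 and 179), and the BLI-associated positions are deliberately left for the caller to supply because the abstract does not list them.

```python
# Ambler sites that the abstract reports as conferring ESBL activity.
ESBL_SITES = frozenset({179, 238})


def classify_shv(substituted_sites, bli_sites=frozenset()):
    """Assign a functional class from the Ambler sites substituted relative to SHV-1.

    substituted_sites: iterable of Ambler positions carrying a substitution.
    bli_sites: positions treated as BLI-resistance-associated (assumption:
               supplied by the caller; not taken from the abstract).
    """
    sites = set(substituted_sites)
    classes = []
    if sites & ESBL_SITES:
        classes.append("ESBL")
    if sites & set(bli_sites):
        classes.append("BLI-resistant")
    # No class-modifying substitution means the allele stays wildtype.
    return "+".join(classes) if classes else "wildtype"


# Position 35 is used here only as an example of a non-class-modifying site,
# and 999 is an obviously artificial placeholder for a BLI-associated site.
print(classify_shv({35}))                    # wildtype
print(classify_shv({35, 238}))               # ESBL
print(classify_shv({999}, bli_sites={999}))  # BLI-resistant
```

Keying the class on specific substitutions rather than on allele names is what lets novel alleles be classified as soon as their sequence is known.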
Alhejaili, A.; Alkherd, U. H.; Safran Alzaidi, A. A.; Rajeh, A. A. A.; Almutairi, M. M.; Alabbad, S. S.; Aljurayan, A. N.; Alzeyadi, Z. A.; Alajel, S. M.; Hang, P.; Alotaiby, M. A.; Alghoribi, M.; Pain, A.; Abdul Salam, W.; Huang, J.; Milner, M.; Zhou, G.; Alzahrani, D.; Bukhari, D.; Aljasham, A. T.; Pain, A.; Banzhaf, M.; Moradigaravand, D.
The Gram-negative bacterium Klebsiella pneumoniae is a major human health threat underlying a broad range of community- and hospital-associated infections. The emergence of clonal hypervirulent strains often resistant to last-resort antimicrobial agents has become a global burden as treatment options are limited. The Kingdom of Saudi Arabia (KSA) has a dynamic and diverse population and serves as a major global tourist hub facilitating the dissemination of multidrug-resistant (MDR) strains of K. pneumoniae. To examine the spread of clinically relevant K. pneumoniae strains, we characterized the population structure and dynamics of multidrug-resistant K. pneumoniae across the KSA hospitals. Methods: We conducted a large genomic survey on a Saudi Arabian collection of multidrug-resistant K. pneumoniae isolates from bloodstream and urinary tract infections. The isolates were collected from 32 hospitals located in 15 major cities across the country in 2022 and 2023. We subjected 352 K. pneumoniae isolates to whole-genome sequencing and employed a broad range of genomic epidemiological and phylodynamic methods to analyse population structure and dynamics at high resolution. We employed an integrated short- and long-read sequencing approach to fully characterize multiple plasmids carrying resistance and virulence genes. Results: Our results indicate that, despite a diverse K. pneumoniae population underlying hospital infections, the population is characterized by the rapid expansion of a few dominant clones, including ST2096 (n=115), ST147 (n=75), ST231 (n=35), ST101 (n=30), ST11 (n=18), ST16 (n=15) and ST14 (n=12). ST2096, ST231, and ST147 clones were estimated to have formed within the past two decades and spread between hospitals across the country on an epidemiological time scale. All STs were genetically linked with globally circulating clones, particularly strains from the Middle East and South Asia.
All of the major clones harboured plasmid-borne ESBLs and a range of carbapenemase genes. Plasmidome analysis revealed multiple mosaic plasmids with resistance and virulence gene cassettes, some of which were shared between the major clades and account for the multidrug-resistant hypervirulent (MDR-hv) phenotype, especially in the ST2096 strains. Integration of phylodynamic data and resistance plasmid profiles showed that the acquisition of plasmids occurred on the same time scales as did the expansion of major clones across the country. Conclusion: Taken together, these results indicate the dissemination of MDR and MDR-hv K. pneumoniae strains across the kingdom and provide evidence for pervasive plasmid sharing and horizontal gene transfer of resistance genes. The results demonstrate the independent introduction of endemic ST147, ST231, and ST101 clones into the country and highlight the clinical significance of ST2096 as an emerging clone with dual resistance and virulence risks. These results highlight the need for continuous surveillance of circulating and newly emergent strains (STs) and of their plasmidome footprints carrying MDR determinants.
Pavlovikj, N.; Carlos Gomes-Neto, J.; Deogun, J. S.; Benson, A. K.
Epidemiological surveillance of bacterial pathogens requires real-time data analysis with a fast turn-around, while aiming at generating two main outcomes: 1) Species level identification; and 2) Variant mapping at different levels of genotypic resolution for population-based tracking, in addition to predicting traits such as antimicrobial resistance (AMR). With the recent advances and continual dissemination of whole-genome sequencing technologies, large-scale population-based genotyping of bacterial pathogens has become possible. Since bacterial populations often present a high degree of clonality in the genomic backbone (i.e., low genetic diversity), the choice of genotyping scheme can even facilitate the understanding of ancestral relationships and can be used for prediction of co-inherited traits such as AMR. Multi-locus sequence typing (MLST) fits that purpose and can identify sequence types (ST) based on seven ubiquitous genome-scattered loci that aid in genotyping isolates beneath the species level. ST-based mapping also standardizes genotyping across laboratories and is used by laboratories worldwide. However, algorithms for inferring ST from Illumina paired-end sequencing data typically rely on genome assembly prior to classification. Genome assembly is computationally intensive and is a bottleneck for speed and scalability, which are important aspects of genomic epidemiology. The stringMLST program uses an assembly-free, kmer-based algorithm for inferring STs, which can overcome the speed and scalability bottlenecks. Here we have systematically studied the accuracy and scalability of stringMLST relative to the standard MLST program across a wide array of phylogenetically divergent Public Health-relevant bacterial pathogens. Our data shows that optimal kmer length for stringMLST is species-specific and that genome-intrinsic and -extrinsic features can affect performance and accuracy of the program. 
While suitable parameters could be identified for most organisms, there were a few instances where this program may not be directly deployable in its current format. More importantly, we integrated stringMLST into our freely available and scalable hierarchical-based population genomics platform, ProkEvo, and further demonstrated how the implementation facilitates automated, reproducible bacterial population analysis. The ProkEvo implementation provides a rapidly deployable genomic epidemiology tool for ST mapping along with other pan-genomic data mining strategies, while providing specific guidance on how to optimize stringMLST performance for a wide variety of bacterial pathogens.
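The idea of assembly-free, kmer-based ST inference described in this abstract can be sketched in a few lines. This is a toy illustration, not stringMLST's implementation: the function names, the two-locus scheme (real MLST schemes use seven loci), the allele sequences, the profile table and the choice of k=5 are all assumptions made for the example.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(alleles, k):
    """Map each k-mer to the set of (locus, allele_id) pairs that contain it."""
    index = {}
    for (locus, allele_id), seq in alleles.items():
        for km in kmers(seq, k):
            index.setdefault(km, set()).add((locus, allele_id))
    return index


def call_alleles(reads, index, k):
    """Count k-mer hits per allele and keep the best-supported allele per locus."""
    votes = Counter()
    for read in reads:
        for km in kmers(read, k):
            for hit in index.get(km, ()):
                votes[hit] += 1
    best = {}
    for (locus, allele_id), n in votes.items():
        if locus not in best or n > best[locus][1]:
            best[locus] = (allele_id, n)
    return {locus: allele_id for locus, (allele_id, _) in best.items()}


def infer_st(reads, alleles, profiles, k):
    """Look up the ST for the allele profile called directly from raw reads."""
    calls = call_alleles(reads, build_index(alleles, k), k)
    loci = sorted({locus for locus, _ in alleles})
    return profiles.get(tuple(calls.get(locus) for locus in loci))


# Hypothetical two-locus scheme and reads, for illustration only.
alleles = {
    ("locusA", 1): "ACGTACGTAA",
    ("locusA", 2): "ACGTTCGTAA",
    ("locusB", 1): "GGCATTGCCA",
    ("locusB", 2): "GGCATAGCCA",
}
profiles = {(1, 1): "ST1", (2, 2): "ST2", (1, 2): "ST3"}
reads = ["ACGTACGT", "GTACGTAA", "GGCATAGC", "ATAGCCA"]

print(infer_st(reads, alleles, profiles, k=5))  # ST3
```

Because no assembly step is needed, run time scales with the number of reads and k-mer lookups. The value of k controls how well closely related alleles are separated, which is consistent with the abstract's finding that the optimal kmer length is species-specific.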
Snaith, A. E.; Dunn, S. J.; Moran, R. A.; Newton, P.; Dance, D.; Davong, V.; Keunzli, E.; Kantele, A.; Corander, J.; McNally, A.
Increased colonisation by antimicrobial resistant organisms is closely associated with international travel. This study investigated the diversity of mobile genetic elements involved with antimicrobial resistance (AMR) gene carriage in extended-spectrum beta-lactamase (ESBL)-producing Escherichia coli that colonised travellers to Laos. Long-read sequencing was used to reconstruct complete plasmid sequences from 49 isolates obtained from the daily stool samples of 23 travellers over a three-week period. This method revealed a collection of 105 distinct plasmids, 38.1% of which carried AMR genes. The plasmids in this population were diverse, mostly unreported and included 38 replicon types, with F-type plasmids (n=22) the most prevalent amongst those carrying AMR genes. Fine-scale analysis of all plasmids identified numerous AMR gene contexts and emphasised the importance of IS elements, specifically members of the IS6/IS26 family, in the creation of complex multi-drug resistance regions. We found a concerning convergence of ESBL and colistin resistance determinants, with three plasmids from two different F-type lineages carrying blaCTX-M and mcr genes. The extensive diversity seen here highlights the worrying probability that stable new vehicles for AMR will evolve in E. coli populations that can disseminate internationally through travel networks. Impact Statement: The global spread of AMR is closely associated with international travel. AMR is a severe global concern and has compromised treatment options for many bacterial pathogens, among them pathogens carrying ESBL and colistin resistance genes. Colonising MDR organisms have the potential to cause serious consequences. Infections caused by MDR bacteria are associated with longer hospitalisation, poorer patient outcomes, greater mortality, and higher costs compared to infections with susceptible bacteria.
This study elucidates the numerous different types of plasmids carrying AMR genes in colonising ESBL-producing E. coli isolates found in faecal samples from travellers to Vientiane, Laos. Here we add to known databases of AMR plasmids by adding these MDR plasmids found in Southeast Asia, an area of high AMR prevalence. We characterised novel AMR plasmids including complex ESBL (blaCTX-M) and colistin (mcr) resistance co-carriage plasmids, emphasising the potential exposure of travellers to Laos to a wide variety of mobile genetic elements that may facilitate global AMR spread. This in-depth study has revealed further detail of the numerous factors that may influence AMR transfer, and therefore potential routes of AMR spread internationally, and is a step towards finding methods to combat AMR spread. Data Summary: Long-read sequencing data are available through the National Center for Biotechnology Information under the BioProject PRJNA853172. Complete plasmid sequences have been uploaded to GenBank with accession numbers in supplementary S1. The authors confirm all supporting data, code and protocols have been provided within the article or through supplementary data files.
Lam, M. M. C.; Hamidian, M.
Acinetobacter baumannii is a Gram-negative pathogen responsible for hospital-acquired infections with high levels of antimicrobial resistance (AMR). The spread of multidrug-resistant A. baumannii strains, particularly those resistant to carbapenems, has become a global concern. Spread of AMR in A. baumannii is primarily mediated by the acquisition of AMR genes through mobile genetic elements, such as plasmids. Thus, a comprehensive understanding of the role of different plasmid types in disseminating AMR genes is essential. In this study, we analysed the distribution of plasmid types, sampling sources, geographic locations, and AMR genes carried on A. baumannii plasmids. A collection of 814 complete plasmid entries was collated and analysed. Most plasmids were identified in clinical isolates from East Asia, North America, South Asia, West Europe, and Australia. We previously devised an Acinetobacter Plasmid Typing (APT) scheme where rep/Rep types were defined using 95% nucleotide identity and updated the scheme in this study by adding 13 novel rep/Rep types (93 types total). The APT scheme now includes 178 Rep variants belonging to three families: R1, R3, and RP. R1-type plasmids were mainly associated with global clone 1 strains, while R3-type plasmids were highly diverse and carried a variety of AMR determinants including carbapenem, aminoglycoside and colistin resistance genes. Similarly, RP-type and rep-less plasmids were also identified as important carriers of aminoglycoside and carbapenem resistance genes. This study provides a comprehensive overview of the distribution and characteristics of A. baumannii plasmids, shedding light on their role in the dissemination of AMR genes. The updated APT scheme and novel findings enhance our understanding of the molecular epidemiology of A. baumannii and provide valuable insights for surveillance and control strategies. IMPORTANCE: A.
baumannii has emerged as a major cause of nosocomial infections, particularly in intensive care units, posing a substantial challenge to patient safety and healthcare systems. Plasmids, which carry antimicrobial resistance (AMR) genes, play a crucial role in the multidrug resistance exhibited by A. baumannii strains, necessitating a comprehensive understanding of plasmid spread, and how to track them. This study provides important insights into A. baumannii plasmid epidemiology, and the extent of their role in spreading clinically significant AMR genes and how they are differentially distributed across different clones i.e. sequence types (STs) and geographical regions. These insights are important for identifying high-risk areas or clones implicated in plasmid transmission, in the context of the spread of multidrug-resistant A. baumannii strains. It also highlights the involvement of R3-type, RP-type and rep-less plasmids in the acquisition and spread of significant AMR genes including those conferring resistance to carbapenems, aminoglycosides and colistin.
Parfitt, K. M.; Pascoe, B.; Jolley, K. A.; Douglas, A.; Goforth, M. P.; Sheppard, S. K.; Maiden, M. C. J.; Colles, F. M.
Campylobacter remains the leading cause of bacterial gastroenteritis worldwide, with C. jejuni accounting for around 90% of infection and C. coli accounting for most of the rest. Seven-locus multilocus sequence typing (MLST) has improved our understanding of host association and population structure, whilst core genome MLST (cgMLST), enables investigation of transmission events at high-resolution. However, the lack of a stable and standardised nomenclature for clustering of cgMLST data has limited reproducibility and long-term comparability between studies. Here we introduce a joint, hierarchical Life Identification Number (LIN) code system that provides reproducible, multi-level genomic identifiers for C. jejuni and C. coli lineages. Using an updated cgMLST v2 scheme (1,142 loci) and globally representative datasets of high-quality genomes selected from over 53,000 assemblies in the Campylobacter PubMLST database (https://pubmlst.org/organisms/campylobacter-jejunicoli), we firstly defined LIN codes on a dataset of 5,664 genomes. Pairwise allelic distances were computed using MSTclust, and 18 nested thresholds were defined through silhouette, adjusted Wallace and adjusted Rand Index (ARI) statistics to capture the population structure from species to outbreak level resolution. The LIN thresholds were then validated using a second dataset of 1,781 genomes from PubMLST and applied to a large water-associated outbreak dataset from New Zealand in 2016, containing clinical and ecological genomes. Further application of LIN codes was demonstrated by analyses of the C. jejuni ST-21 clonal complex and ST-6175 isolates, as well as the broader population structure of C. coli, using data from PubMLST. Across all datasets, LIN clusters were stable, largely monophyletic, and back-compatible with existing nomenclature, accurately distinguishing host-adapted and outbreak-associated lineages. 
By embedding cgMLST data within a stable and scalable nomenclature, the Campylobacter LIN system delivers consistent, automated genome-to-lineage assignment. This unified framework bridges population genetics and applied surveillance, enabling robust, real-time comparison of Campylobacter isolates across sources, studies, and time. Impact statement: Human cases of Campylobacter worldwide continue unabated. Tracing the source of Campylobacter infection is particularly challenging given the sporadic or multi-source nature of outbreaks, with potential transmission from foodborne, animal or environmental sources. Seven-locus MLST has greatly improved our broad understanding of Campylobacter population structure. However, whilst high-resolution cgMLST alleles and STs themselves do not change, longitudinal cluster analyses of cgMLST data have lacked a stable nomenclature, rendering them unsuitable for robust and comparable surveillance over time. Life Identification Number (LIN) codes provide a solution to this problem, establishing an automated and scalable nomenclature derived directly from cgMLST profiles, that is stable over time. We have implemented a joint C. jejuni and C. coli LIN code scheme in PubMLST, with scripts for real-time lineage assignment. LIN codes are back-compatible with existing MLST nomenclature, and we demonstrate their added practical value for exploring population structure and high-resolution outbreak investigation. LIN codes support surveillance of Campylobacter in a One Health context, by enabling consistent typing at multiple levels across different sources, laboratories and time. Data summary: 1. The isolate collections used to develop the LIN codes are publicly available and searchable as individual projects on the PubMLST database (https://pubmlst.org).
- LIN code development (Dataset 1) (n=5,664 isolates, up to 200 isolates per clonal complex)
- LIN code validation (Dataset 2) (n=1,781 isolates, up to 50 isolates per clonal complex)
- Outbreak investigation (Dataset 3): New Zealand 2016 Havelock North waterborne outbreak, Gilpin et al (n=161 isolates) [1]
- Population structure exploration (clonal complex) (Dataset 4): ST-21 complex (n=1800 isolates, up to 100 isolates randomly selected from each country)
- Population structure exploration (sequence type) (Dataset 5): ST-6175 (n=321 isolates, genomes with good cgMLST v2 annotation)
2. The software for LIN code development is publicly available as follows:
- MSTclust for pairwise distance matrices https://gitlab.pasteur.fr/GIPhy/MSTclust [2]
- Python script to define LIN codes in a local dataset (https://gitlab.pasteur.fr/BEBP/LINcoding)
- BIGSdb Perl script to define LIN codes from cgMLST profiles on the PubMLST database (https://github.com/kjolley/BIGSdb/blob/develop/scripts/maintenance/lincodes.pl)
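How a hierarchical LIN code can be derived from pairwise allelic distances and a set of nested thresholds can be shown with a minimal sketch. It is not the BIGSdb or LINcoding implementation listed above: the function names, the six-locus profiles and the three thresholds are hypothetical, whereas the scheme described in the abstract uses 1,142 loci and 18 thresholds.

```python
def allelic_distance(a, b):
    """Count loci with different alleles, skipping loci missing (None) in either profile."""
    return sum(1 for x, y in zip(a, b)
               if x is not None and y is not None and x != y)


def assign_lin(profile, coded, thresholds):
    """Assign a LIN code to `profile` given already coded genomes.

    coded: list of (profile, lin_code) tuples.
    thresholds: strictly decreasing allelic-mismatch cut-offs, one per code position.
    """
    if not coded:
        return (0,) * len(thresholds)
    # The closest already-coded genome decides how much of the code is inherited.
    ref_profile, ref_code = min(
        coded, key=lambda pc: allelic_distance(profile, pc[0]))
    d = allelic_distance(profile, ref_profile)
    # With decreasing thresholds, the satisfied ones form a prefix of the code.
    shared = sum(1 for t in thresholds if d <= t)
    if shared == len(thresholds):
        return tuple(ref_code)
    prefix = tuple(ref_code[:shared])
    # Open a new bin at the first unshared position, then pad with zeros.
    used = [code[shared] for _, code in coded if tuple(code[:shared]) == prefix]
    return prefix + (max(used) + 1,) + (0,) * (len(thresholds) - shared - 1)


# Hypothetical thresholds (allowed mismatches) and six-locus profiles.
thresholds = (4, 2, 0)
coded = []
for prof in [(1, 1, 1, 1, 1, 1), (1, 1, 1, 1, 1, 2),
             (1, 1, 1, 2, 2, 2), (2, 2, 2, 2, 2, 2)]:
    coded.append((prof, assign_lin(prof, coded, thresholds)))

for prof, code in coded:
    print(prof, "_".join(map(str, code)))
```

Codes are only ever appended and never recomputed, so existing identifiers stay fixed as new genomes are added. That append-only property is the kind of stability over time that the abstract contrasts with repeated clustering of cgMLST data.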
Ashcroft, M. M.; Forde, B. M.; Phan, M.-D.; Peters, K. M.; Roberts, L. W.; Chan, K.-G.; Chong, T. M.; Yin, W.-F.; Paterson, D. L.; Walsh, T. R.; Schembri, M. A.; Beatson, S. A.
Escherichia coli Sequence Type (ST)101 is an emerging, multi-drug resistant lineage associated with carbapenem resistance. We recently completed a comprehensive genomics study on mobile genetic elements (MGEs) and their role in blaNDM-1 dissemination within the ST101 lineage. DNA methyltransferases (MTases) are also frequently associated with MGEs, with DNA methylation guiding numerous biological processes including genomic defence against foreign DNA and regulation of gene expression. The availability of Pacific Biosciences Single Molecule Real Time Sequencing data for seven ST101 strains enabled us to investigate the role of DNA methylation on a genome-wide scale (methylome). We defined the methylome of two complete (MS6192 and MS6193) and five draft (MS6194, MS6201, MS6203, MS6204, MS6207) ST101 genomes. Our analysis identified 14 putative MTases and eight N6-methyladenine DNA recognition sites, with one site that has not been described previously. Furthermore, we identified a Type I MTase encoded within a Transposon 7-like Transposon and show its acquisition leads to differences in the methylome between two almost identical isolates. Genomic comparisons with 13 previously published ST101 draft genomes identified variations in MTase distribution, consistent with MGE differences between genomes, highlighting the diversity of active MTases within strains of a single E. coli lineage. It is well established that MGEs can contribute to the evolution of E. coli due to their virulence and resistance gene repertoires. This study emphasises the potential for mobile genetic elements to also enable highly similar bacterial strains to rapidly acquire genome-wide functional differences via changes to the methylome. Impact Statement: Escherichia coli ST101 is an emerging human pathogen frequently associated with carbapenem resistance. E.
coli ST101 strains carry numerous mobile genetic elements that encode virulence determinants, antimicrobial resistance, and DNA methyltransferases (MTases). In this study we provide the first comprehensive analysis of the genome-wide complement of DNA methylation (methylome) in seven E. coli ST101 genomes. We identified a Transposon carrying a Type I restriction modification system that may lead to functional differences between two almost identical genomes and showed how small recombination events at a single genomic region can lead to global methylome changes across the lineage. We also showed that the distribution of MTases throughout the ST101 lineage was consistent with the presence or absence of mobile genetic elements on which they are encoded. This study shows the diversity of MTases within a single bacterial lineage and shows how strain and lineage-specific methylomes may drive host adaptation. Data Summary: Sequence data including reads, assemblies and motif summaries have previously been submitted to the National Center for Biotechnology Information (https://www.ncbi.nlm.nih.gov) under the BioProject Accessions: PRJNA580334, PRJNA580336, PRJNA580337, PRJNA580338, PRJNA580339, PRJNA580341 and PRJNA580340 for MS6192, MS6193, MS6194, MS6201, MS6203, MS6204 and MS6207 respectively. All supporting data, code, accessions, and protocols have been provided within the article or through supplementary data files.
Lean, S. S.; Morris, D. E.; Anderson, R.; Alattraqchi, A. G.; Cleary, D. W.; Clarke, S.; Yeo, C. C.
Acinetobacter baumannii is widely recognized as a multidrug-resistant pathogen, but most public genome datasets are biased toward hospital-derived isolates, so little is known about A. baumannii isolates from healthy individuals in the community. This study analysed genome sequences from nine A. baumannii isolates obtained from the upper respiratory tract of healthy individuals of the indigenous Orang Asli community in their rural settlements located on the east coast of Peninsular Malaysia. Genomic analysis revealed all nine A. baumannii isolates to be genetically distinct and included three new sequence types (STs) under the Pasteur scheme and six novel STs under the Oxford scheme. Notably, one isolate, A. baumannii 19064, belonged to Global Clone 8 (GC8), a lineage associated with clinical infections. Core genome phylogeny indicated that the Orang Asli community isolates were interleaved with several non-GC clinical isolates from the tertiary hospital within the same state. This is suggestive of the potential of these community isolates to cause infections under conducive conditions. This is also attested by the identification of several virulence factors in their genomes. All nine isolates carried various intrinsic blaOXA-51-like class D β-lactamase genes and class C Acinetobacter-derived cephalosporinase (ADC) blaADC genes but remained susceptible to meropenem. Two isolates, A. baumannii 19053 and 19062, were tetracycline resistant but minocycline susceptible and harboured the tet(39)-tetR gene pair within mobile pdif modules located on distinct Rep_3 family plasmids. Only one isolate, A. baumannii 19055, is plasmid-free; the rest mainly harboured cryptic Rep_3-type plasmids, often containing identifiable pdif modules. These findings highlight the clinical relevance of A. baumannii strains residing in healthy individuals, particularly in isolated communities that are seldom accessible to public health. Despite their remote location, the Orang Asli A.
baumannii isolates possess virulence factors and antibiotic resistance genes similar to those found in hospital settings. This underscores the importance of genomic surveillance of commensal pathogens, and taking this road which is less travelled can help inform broader epidemiological insights and guide future public health strategies.